When Data Goes Missing: Methods for Missing Score Imputation in Biometric Fusion
نویسندگان
چکیده
While fusion can be accomplished at multiple levels in a multibiometric system, score level fusion is commonly used as it offers a good trade-off between fusion complexity and data availability. However, missing scores affect the implementation of several biometric fusion rules. While there are several techniques for handling missing data, the imputation scheme which replaces missing values with predicted values is preferred since this scheme can be followed by a standard fusion scheme designed for complete data. This paper compares the performance of three imputation methods: Imputation via Maximum Likelihood Estimation (MLE), Multiple Imputation (MI) and Random Draw Imputation through Gaussian Mixture Model estimation (RD GMM). A novel method called Hot-deck GMM is also introduced and exhibits markedly better performance than the other methods because of its ability to preserve the local structure of the score distribution. Experiments on the MSU dataset indicate the robustness of the schemes in handling missing scores at various missing data rates.
منابع مشابه
A comparison of imputation methods for handling missing scores in biometric fusion
Multibiometric systems, which consolidate or fuse multiple sources of biometric information, typically provide better recognition performance than unimodal systems. While fusion can be accomplished at various levels in a multibiometric system, score-level fusion is commonly used as it offers a good tradeoff between data availability and ease of fusion. Most score-level fusion rules assume that ...
متن کاملInfluence of Pattern of Missing Data on Performance of Imputation Methods: An Example from National Data on Drug Injection in Prisons
Background Policy makers need models to be able to detect groups at high risk of HIV infection. Incomplete records and dirty data are frequently seen in national data sets. Presence of missing data challenges the practice of model development. Several studies suggested that performance of imputation methods is acceptable when missing rate is moderate. One of the issues which was of less concern...
متن کاملMissing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Multivariate time series missing data imputation is a challenging topic and needs to be carefully considered before learning or predicting time series. Frequent researches have been done on the use of diffe...
متن کاملچند رویکرد برخورد با مقادیر گمشده متغیرهای کمی و بررسی اثر آنها بر نتایج حاصل از یک کارآزمایی بالینی
Background and Objectives: A major challenge that affects the longitudinal studies is the problem of missing data. Missing in the data may result in the loss of part of the information which reduces the accuracy of the estimator and obtain the results will be biased and inaccurate. Therefore, it is necessary to evaluate the missing data mechanism from a longitudinal research and to consider thi...
متن کاملکاربرد جای گذاری چندگانه در تحقیقات پزشکی و اپیدمیولوژی
Data missing, which occurs for different reasons, is an unavoidable problem in epidemiological studies. It is quite widespread and, therefore, it is considered as a challenge in research design and data analysis by many methodologists. Complete case analysis is often used in studies with missing data however, this approach may result in inaccurate estimates and inferences due to bias associated...
متن کامل